659 research outputs found

    Multiple structural alignment for distantly related all-β structures using TOPS pattern discovery and simulated annealing

    Topsalign is a method for structurally aligning diverse protein structures, such as the members of a protein superfold. All proteins within a superfold share the same fold but often have very low sequence identity and different biological and biochemical functions. There is often significant structural diversity around the common scaffold of secondary structure elements of the fold. Topsalign uses topological descriptions of proteins. A pattern discovery algorithm identifies equivalent secondary structure elements between a set of proteins, and these are used to produce an initial multiple structure alignment. Simulated annealing is then used to optimize the alignment. The output of Topsalign is a multiple structure-based sequence alignment and a 3D superposition of the structures. The method has been tested on three superfolds: the β jelly roll, the TIM (α/β) barrel and the OB fold. Topsalign outperforms established methods on very diverse structures. Although the pattern discovery works only on β strand secondary structure elements, Topsalign is shown to align TIM (α/β) barrel superfamilies, which contain both α helices and β strands.

    Continuous automata: bridging the gap between discrete and continuous time system models

    The principled use of models in design and maintenance of a system is fundamental to the engineering methodology. As the complexity and sophistication of systems increase, so do the demands on the system models required to design them. In particular, the design of agent systems situated in the real world, such as robots, will require design models capable of expressing discrete and continuous changes of system parameters. Such systems are referred to as mode-switching or hybrid systems. This thesis investigates ways in which time is represented in automata system models with discretely and continuously changing parameters. Existing automaton approaches to hybrid modelling rely on describing continuous change at a sequence of points in time. In such approaches the time that elapses between each point is chosen non-deterministically in order to ensure that the model does not over-step a discrete change. In contrast, the new approach this thesis proposes describes continuous change by a continuum of points which can naturally and deterministically capture such change. As well as defining the semantics of individual models, the nature of the temporal representation is particularly important in defining the composition of modular components. This new approach leads to a clear compositional semantics based on the synchronization of input and output values. The main contribution of this work is the derivation of a limiting process which provides a theoretical foundation for this new approach. It not only provides a link between discrete and continuous time representations, but also provides a basis for deciding which continuous time representations are theoretically sound.
    The resulting formalism, the Continuous I/O machine, is demonstrated to be comparable to Hybrid Automata in expressibility, but its representation of time gives it a much stronger compositional semantics based on the discrete synchronous machines from which it is derived. The conclusion of this work is that it is possible to define an automaton model that describes a continuum of events and that this can be effectively used to model complete mode-switching physical systems in a modular fashion.

    Pattern matching and pattern discovery algorithms for protein topologies

    We describe algorithms for pattern matching and pattern learning in TOPS diagrams (formal descriptions of protein topologies). These problems can be reduced to checking for subgraph isomorphism and finding maximal common subgraphs in a restricted class of ordered graphs. We have developed a subgraph isomorphism algorithm for ordered graphs, which performs well on the given set of data. The maximal common subgraph problem is then solved by repeated subgraph extension and checking for isomorphisms. Despite its apparent inefficiency, this approach gives an algorithm with time complexity proportional to the number of graphs in the input set and is still practical on the given set of data. As a result, we obtain fast methods which can be used for building a database of protein topological motifs, and for comparing a given protein of known secondary structure against a motif database.
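
    The matching step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it is a plain backtracking search for an order-preserving embedding of a pattern graph in a target graph. The encoding is an assumption made for illustration only (vertex labels "E" for strand and "H" for helix, edge labels "A" and "P" for antiparallel and parallel pairings), and the names OrderedGraph and find_match are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]


class OrderedGraph:
    """Vertices are positions 0..n-1 in sequence order; edges carry a label.

    Hypothetical encoding: labels is a string such as "EEH", and edges maps
    a vertex pair to an edge label such as "A" or "P".
    """

    def __init__(self, labels: str, edges: Dict[Edge, str]):
        self.labels = labels
        # Store every edge with the smaller index first.
        self.edges = {(min(u, v), max(u, v)): kind for (u, v), kind in edges.items()}


def find_match(pattern: OrderedGraph, target: OrderedGraph) -> Optional[List[int]]:
    """Return the first order-preserving embedding of pattern in target, or None.

    Every pattern edge must be present in the target with the same label;
    extra target edges are allowed (non-induced subgraph matching).
    """
    n, m = len(pattern.labels), len(target.labels)
    mapping: List[int] = []  # mapping[p] = target vertex chosen for pattern vertex p

    def consistent(p: int, t: int) -> bool:
        if pattern.labels[p] != target.labels[t]:
            return False
        # Check edges from already-mapped pattern vertices to p.
        for q in range(p):
            kind = pattern.edges.get((q, p))
            if kind is not None and target.edges.get((mapping[q], t)) != kind:
                return False
        return True

    def extend(p: int, start: int) -> bool:
        if p == n:
            return True
        # Leave enough target vertices for the remaining pattern vertices.
        for t in range(start, m - (n - p) + 1):
            if consistent(p, t):
                mapping.append(t)
                if extend(p + 1, t + 1):
                    return True
                mapping.pop()
        return False

    return list(mapping) if extend(0, 0) else None


# Usage: three strands paired in sequence, searched for in a six-element target.
pattern = OrderedGraph("EEE", {(0, 1): "A", (1, 2): "A"})
target = OrderedGraph("HEEHEE", {(1, 2): "A", (2, 4): "A", (4, 5): "P"})
print(find_match(pattern, target))
```

    Because the mapping must preserve vertex order, each pattern vertex only needs to be tried against target positions to the right of the previous choice, which keeps the search space much smaller than in general subgraph isomorphism.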

    “The great source” microplastic abundance and characteristics along the river Thames

    This study focused on quantifying the abundance of microplastics within the surface water of the River Thames, UK. Ten sites in eight areas were sampled within the tidal Thames, starting from Teddington and ending at Southend-on-Sea. Three litres of water were collected monthly at high tide from land-based structures at each site from May 2019 to May 2021. Samples underwent visual analysis for microplastics, which were categorised by type, colour and size. 1041 pieces were tested using Fourier transform spectroscopy to identify chemical composition and polymer type. 6401 pieces of microplastic (MP) were found during sampling, with an average abundance of 12.27 pieces L⁻¹ along the River Thames. Results from this study show that microplastic abundance does not increase along the river.

    Bayesian refinement of protein functional site matching

    Background: Matching functional sites is a key problem for the understanding of protein function and evolution. The commonly used graph theoretic approach, and other related approaches, require adjustment of a matching distance threshold a priori according to the noise in atomic positions. This is difficult to pre-determine when matching sites related by varying evolutionary distances and crystallographic precision. Furthermore, the graph method is sometimes unable to identify alternative but important solutions in the neighbourhood of the distance-based solution because of strict distance constraints. We consider the Bayesian approach to improve graph-based solutions. In principle this approach applies to other methods with strict distance matching constraints. In contrast to combinatorial formulations, the Bayesian method can flexibly incorporate all types of prior information on specific binding sites (e.g. amino acid types). Results: We present a new meta-algorithm for matching protein functional sites (active sites and ligand binding sites) based on an initial graph matching followed by refinement using a Markov chain Monte Carlo (MCMC) procedure. This procedure is an innovative extension to our recent work. The method accounts for the 3-dimensional structure of the site as well as the physico-chemical properties of the constituent amino acids. The MCMC procedure can lead to a significant increase in the number of significant matches compared to the graph method, as measured independently by rigorously derived p-values. Conclusion: The MCMC refinement step is able to significantly improve graph-based matches. We apply the method to matching NAD(P)(H) binding sites within single Rossmann fold families, between different families in the same superfamily, and in different folds. Within families, sites are often well conserved, but there are examples where significant shape-based matches do not retain similar amino acid chemistry, indicating that even within families the same ligand may be bound using substantially different physico-chemistry. We also show that the procedure finds significant matches between binding sites for the same co-factor in different families and different folds.

    Prognostic microRNAs in high-grade glioma reveal a link to oligodendrocyte precursor differentiation.

    MicroRNA expression can be exploited to define tumor prognosis and stratification for precision medicine. It remains unclear whether prognostic microRNA signatures are exclusively tumor grade and/or molecular subtype-specific, or whether common signatures of aggressive clinical behavior can be identified. Here, we defined microRNAs that are associated with good and poor prognosis in grade III and IV gliomas using data from The Cancer Genome Atlas. Pathway analysis of microRNA targets that are differentially expressed in good and poor prognosis glioma identified a link to oligodendrocyte development. Notably, a microRNA expression profile that is characteristic of a specific oligodendrocyte precursor cell type (OP1) correlates with microRNA expression from 597 of these tumors and is consistently associated with poor patient outcome in grade III and IV gliomas. Our study reveals grade-independent and subtype-independent prognostic molecular signatures in high-grade glioma and provides a framework for investigating the mechanisms of brain tumor aggressiveness.

    An optimized TOPS+ comparison method for enhanced TOPS models

    This article has been made available through the Brunel Open Access Publishing Fund. Background: Although methods based on highly abstract descriptions of protein structures, such as VAST and TOPS, can perform very fast protein structure comparison, the results can lack a high degree of biological significance. Previously we have discussed the basic mechanisms of our novel method for structure comparison based on our TOPS+ model (Topological descriptions of Protein Structures Enhanced with Ligand Information). In this paper we show how these results can be significantly improved using parameter optimization, and we call the resulting optimised method the advanced TOPS+ comparison method (advTOPS+). Results: We have developed a TOPS+ string model as an improvement to the TOPS [1-3] graph model by considering loops as secondary structure elements (SSEs) in addition to helices and strands, representing ligands as first-class objects, describing interactions between SSEs, and between SSEs and ligands, by incoming and outgoing arcs, and annotating SSEs with the interaction direction and type. Benchmarking results of an all-against-all pairwise comparison using a large dataset of 2,620 non-redundant structures from the PDB40 dataset [4] demonstrate the biological significance, in terms of SCOP classification at the superfamily level, of our TOPS+ comparison method. Conclusions: Our advanced TOPS+ comparison shows better performance on the PDB40 dataset [4] than our basic TOPS+ method, giving 90 percent accuracy for SCOP alpha+beta, a 6 percent increase in accuracy compared to the TOPS and basic TOPS+ methods. It also outperforms the TOPS, basic TOPS+ and SSAP comparison methods on the Chew-Kedem dataset [5], achieving 98 percent accuracy. Software Availability: The TOPS+ comparison server is available at http://balabio.dcs.gla.ac.uk/mallika/WebTOPS/.